
Conversation

mengniwang95
Contributor

Type of Change

example

Description

Add dlrm_v2 CPU FP8 QDQ example
Depends on #2238

FP32: 0.8031
FP8: 0.8080
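For context on what a QDQ (quantize–dequantize) example exercises: values round-trip through the FP8 E4M3 range so the accuracy impact of FP8 storage can be measured while compute stays in higher precision. Below is a minimal, self-contained simulation of one E4M3 round-trip; it is illustrative only, not the implementation in this PR, and `fp8_e4m3_qdq` is a hypothetical helper name:

```python
import math

E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def fp8_e4m3_qdq(x, scale):
    """Simulated FP8 QDQ: scale into the E4M3 range, round to the nearest
    representable value (1 implicit + 3 mantissa bits), then scale back."""
    y = x / scale
    y = max(-E4M3_MAX, min(E4M3_MAX, y))  # saturate to the FP8 range
    if y == 0.0:
        return 0.0
    m, e = math.frexp(y)        # y = m * 2**e with 0.5 <= |m| < 1
    q = round(m * 16) / 16      # keep 4 significand bits (16 steps per octave)
    return q * (2 ** e) * scale
```

Running, e.g., `fp8_e4m3_qdq(0.3, 1.0)` returns `0.3125`, showing the rounding error a QDQ pass introduces; the per-tensor `scale` is what a real flow would calibrate.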

@mengniwang95 mengniwang95 requested review from xin3he and thuang6 July 22, 2025 06:45
@chensuyue chensuyue added this to the 3.5 milestone Jul 22, 2025
return self.crossnet(concat_dense_sparse)


class IPEX_DLRM_DCN(DLRM_DCN):
Contributor

Why name it IPEX_XXX? I don't see any dependency on IPEX; better to rename it.

Contributor Author

done

model = construct_model(args)
model.model.sparse_arch = model.model.sparse_arch.bfloat16()

qconfig = FP8Config(
Contributor

I know Linear is in the default quantization op list. How is EmbeddingBag included in the quantization op list? Did we add it to the default?

Contributor Author

Yes, currently the CPU device supports Conv, Linear, and EmbeddingBag FP8 quantization by default:

PATCHED_MODULE_TABLE["cpu"].update({"Linear": ModuleInfo("linear", PatchedLinear),
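The table quoted above registers, per device, which module types get swapped for FP8-patched wrappers. A dict-driven sketch of how such a registry can drive module patching (all class names here are simplified stand-ins, not the actual neural-compressor internals):

```python
class ModuleInfo:
    """Pairs an op-type label with the wrapper class to swap in."""
    def __init__(self, op_type, patched_cls):
        self.op_type = op_type
        self.patched_cls = patched_cls

class Linear:            # stand-in for torch.nn.Linear
    pass

class PatchedLinear:     # stand-in for an FP8 QDQ wrapper module
    def __init__(self, orig):
        self.orig = orig

# Per-device registry: quantization only touches types listed here.
PATCHED_MODULE_TABLE = {"cpu": {}}
PATCHED_MODULE_TABLE["cpu"].update({
    "Linear": ModuleInfo("linear", PatchedLinear),
    # "Conv2d" and "EmbeddingBag" entries would follow the same pattern
})

def patch(module, device="cpu"):
    """Swap a module for its registered wrapper; leave others untouched."""
    info = PATCHED_MODULE_TABLE[device].get(type(module).__name__)
    return info.patched_cls(module) if info else module
```

This is why EmbeddingBag is quantized "by default" on CPU: being present in the device's table is the opt-in, with no extra user configuration needed.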

@XuehaoSun XuehaoSun merged commit 1ab2011 into master Jul 25, 2025
11 checks passed
@XuehaoSun XuehaoSun deleted the mengni/dlrmv2 branch July 25, 2025 02:08